National Repository of Grey Literature: 142 records found, records 1 - 10 shown
Word Sense Clustering
Haljuk, Petr ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis deals with the semantic similarity of words. It describes the design and implementation of a system that searches for the most similar words and measures the semantic similarity between words. The system uses the Word2Vec model from the GenSim library and learns the relations among words from the CommonCrawl corpus.
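As a rough illustration of how a "most similar words" search over word vectors can work, the sketch below ranks words by cosine similarity. The three-dimensional vectors and the names `VECTORS`, `cosine_similarity` and `most_similar` are invented for this example; the thesis itself relies on Word2Vec vectors trained on CommonCrawl, which are not reproduced here.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# A trained Word2Vec model would supply vectors with hundreds of dimensions.
VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "apple": [0.1, 0.2, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors, top_n=2):
    """Return up to top_n (word, similarity) pairs, best match first."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `most_similar("king", VECTORS)` on these toy vectors ranks "queen" ahead of "apple", because the first two vectors point in nearly the same direction.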
Data Mining in Social Networks
Raška, Jiří ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor)
This thesis deals with knowledge discovery from social media, focusing on feature-based opinion mining from user reviews. The theoretical part describes methods of opinion mining and natural language processing. The main part of the thesis is the design and implementation of a library for opinion mining based on the Stanford Parser and the WordNet lexicon. Features are identified using dependency grammar, implicit features are mined with the CoAR method, and opinions are classified with a supervised algorithm. Finally, experiments with the implemented library and examples of its usage are presented.
Natural Language Processing: Analysis of Information Technology Students’ Spoken Language
Stanković, Aleksandar ; Šťastná, Dagmar (referee) ; Ellederová, Eva (advisor)
This bachelor's thesis deals with new artificial intelligence technologies in natural language processing. The thesis is divided into a theoretical and an analytical part. The theoretical part approaches the problem in three chapters: artificial intelligence and statistics, natural language processing, and IBM Watson Natural Language Understanding. Each chapter is elaborated with at least one practical example. The main aim of the first chapter is to define the theoretical framework of artificial intelligence and its methods, while the second chapter explains natural language processing and its primary functions, including its relation to artificial intelligence itself. The third chapter introduces natural language understanding as the primary tool for the analysis carried out in the analytical part. The analytical part analyses students' spoken language using various methods. The collected video samples are transcribed by machine translation as an application of natural language processing, while the text output is analysed with a natural language understanding tool. The analytical part, which describes the research methodology and presents and interprets the research results, draws on the knowledge from the theoretical part.
Authorship Identification
Fabiánek, Ondřej ; Škoda, Petr (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis deals with authorship identification based on knowledge of an author's previous texts. The aim is to analyze existing methods of authorship attribution and to create a system capable of highly successful authorship identification. The system is based on multivariate analysis and specializes in English books. The solution also includes a graphical user interface.
Automatic Humor Evaluation
Katrňák, Josef ; Ondřej, Karel (referee) ; Dočekal, Martin (advisor)
The aim of this thesis is to create a system for automatic humor evaluation. The system predicts humor and its category for English input. The core of the work is to create a classifier and train the model on the created datasets to get the best possible results. The classifier architecture is based on neural networks. The system also includes a web user interface for communication with the user. The result is a web application linked to a classifier that evaluates user input and lets the user provide feedback.
Named Entity Recognition
Rylko, Vojtěch ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This master's thesis describes the history and theoretical background of named-entity recognition and the implementation of a system in C++ for named entity recognition and disambiguation. The system uses a local disambiguation method and statistics generated from the Wikilinks web dataset. Various experiments and tests are performed with the implemented system and with alternative implementations. These experiments show that the system is sufficiently successful and fast. The system participates in the Entity Recognition and Disambiguation Challenge 2014.
Automatic Link Detection in Parts of Audiovisual Documents
Sychra, Marek ; Černocký, Jan (referee) ; Szőke, Igor (advisor)
This paper deals with topic detection, specifically with two tasks: link detection, which finds similarities among a group of short documents according to their topic, and story segmentation, which finds the borders between topically different parts of a large document. The main motivation for the research was a practical application using presentation materials from lectures at FIT (linking parts of different lectures and courses). Link detection is achieved by text and word analysis, which includes learning the meaning and importance of each word. Story segmentation uses this analysis while searching for the boundaries. Both parts of the problem (link detection, story segmentation) gave great results when tested on a standard dataset (world news reports). When evaluated on lecture processing, the success rate was lower but still good.
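One common way to weight word importance and compare short documents by topic is TF-IDF with cosine similarity. The abstract does not say which weighting the paper uses, so the sketch below is a stand-in under that assumption, and the function names `tfidf_vectors` and `cosine` are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Map each document to a sparse {word: tf * idf} dictionary."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append(
            {w: c * math.log(n_docs / doc_freq[w]) for w, c in term_freq.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Two documents would then be linked when their similarity exceeds a chosen threshold; words that occur in every document get weight zero and do not contribute.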
Extraction of Relations among Named Entities Mentioned in Text
Voháňka, Ondřej ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis deals with relation extraction. It explains the basic knowledge necessary for creating an extraction system. It then describes the design, implementation and comparison of three systems that work differently. The following methods were used: regular expressions, NER, and a parser.
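To give a sense of the simplest of the three approaches, the sketch below extracts one relation type with a regular expression. The pattern, the relation label `born_in` and the function name are hypothetical and chosen only for illustration; the abstract does not list the patterns used in the thesis.

```python
import re

# Hypothetical pattern: "<Person> was born in <Place>".
# Runs of capitalized words stand in for named entities here.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
BORN_IN = re.compile(rf"(?P<person>{NAME}) was born in (?P<place>{NAME})")


def extract_born_in(text):
    """Return (person, 'born_in', place) triples found in the text."""
    return [
        (m.group("person"), "born_in", m.group("place"))
        for m in BORN_IN.finditer(text)
    ]
```

A pattern like this is precise but brittle: it misses paraphrases such as "a native of", which is one reason to compare it against NER-based and parser-based systems.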
Email spam filtering using artificial intelligence
Safonov, Yehor ; Uher, Václav (referee) ; Kolařík, Martin (advisor)
In the modern world, email is the most widely used technology for exchanging messages between users. It rests on three pillars that contribute to its popularity and stimulate its rapid growth: free availability, efficiency, and intuitiveness of information exchange. All of them constitute a significant advantage in the provision of communication services. On the other hand, the growing popularity of email technologies poses considerable security risks and turns email into a universal tool for spreading unsolicited content. Potential attacks may be aimed at either specific endpoints or whole computer infrastructures. Despite achieving high accuracy in spam filtering, traditional techniques often fail to keep up with the rapid growth and evolution of spam techniques. These approaches suffer from overfitting, convergence to poor local minima, inefficiency in high-dimensional data processing, and long-term maintainability issues. One of the main goals of this master's thesis is to develop and train deep neural networks using the latest machine learning techniques to solve the text-based spam classification problem, which belongs to the Natural Language Processing (NLP) domain. From a theoretical point of view, the thesis focuses on e-mail communication with an emphasis on spam filtering. The following parts of the thesis turn to machine learning and artificial neural networks and discuss the principles of their operation and their basic properties. The theoretical part also covers possible ways of applying the described techniques to text analysis and NLP tasks. One of the key aspects of the study is a detailed comparison of current machine learning methods, their specifics, and their accuracy when applied to spam filtering. The practical part begins with the processing of the e-mail dataset. This phase was divided into five stages with the aim of preserving key features of the raw data and increasing the final quality of the dataset. The created dataset was used for training, testing and validation of the chosen deep neural network types. The selected models, ULMFiT, BERT and XLNet, were successfully implemented. The thesis includes a description of the final data adaptation, the learning process of the neural networks, and their testing and validation. At the end of the work, the implemented models are compared using a confusion matrix, and possible improvements and a concise conclusion are outlined.
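A confusion matrix for a binary spam classifier counts the four possible outcomes of a prediction. The sketch below shows how such a matrix and two derived metrics could be computed; the label strings `"spam"` and `"ham"` and the function names are assumptions for the example, and no figures from the thesis are reproduced.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive="spam"):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}


def precision_recall(cm):
    """Precision and recall for the positive class, 0.0 when undefined."""
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    precision = cm["tp"] / predicted_pos if predicted_pos else 0.0
    recall = cm["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall
```

Computing one matrix per model on the same test set makes the models directly comparable, for example by how many legitimate messages each one wrongly flags as spam (the `fp` count).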
Computer as an Intelligent Partner in the Word-Association Game Codenames
Jareš, Petr ; Fajčík, Martin (referee) ; Smrž, Pavel (advisor)
This thesis addresses the determination of semantic similarity between words. For this task, a combination of the predictive model fastText and the count-based method Pointwise Mutual Information is used. The thesis describes a system that uses semantic models to substitute for a player in the word association game Codenames. The system implements a game strategy that uses context information from the game progression to benefit its own team. The system is able to substitute for a player in both team roles.
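Pointwise Mutual Information compares how often two words occur together with how often they would co-occur if they were independent: PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The sketch below estimates it from sentence-level co-occurrence counts. Using the sentence as the co-occurrence window and the function name `pmi_table` are assumptions made for the example, and the fastText half of the combination is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Return {(word_a, word_b): PMI} for word pairs sharing a sentence.

    Probabilities are estimated as the fraction of sentences that
    contain a word or a pair. Keys are alphabetically sorted tuples.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    n = len(sentences)
    return {
        (a, b): math.log2((count / n) / ((word_counts[a] / n) * (word_counts[b] / n)))
        for (a, b), count in pair_counts.items()
    }
```

Positive values mark pairs that co-occur more often than chance would predict, which is the kind of association signal a Codenames clue needs; pairs that never co-occur are simply absent from the table.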
